2,617 research outputs found

Integrating sequence and structural biology with DAS.

Author: A Andreeva
A Kryshtafovych
A Prlić
A Zemla
Andreas Kähäri
Andreas Prlić
B Coessens
D Hull
DL Wheeler
Eugene Kulesha
H Sugawara
I Cases
IN Shindyalov
J Moult
JM Fernández
JR Macías
L Holm
L Käll
L Stein
LD Stein
M Clamp
M Tagari
MD Wilkinson
NJ Mulder
P Jones
P Lackner
PI Olason
PN Seibel
R Bose
RD Dowell
RD Finn
Robert D Finn
RT Fielding
S Pillai
Thomas A Down
Tim JP Hubbard
TJP Hubbard
Publication venue: BMC Bioinformatics
Publication date: 12/09/2007

BACKGROUND: The Distributed Annotation System (DAS) is a network protocol for exchanging biological data. It is frequently used to share annotations of genomes and protein sequences. RESULTS: Here we present several extensions to the current DAS 1.5 protocol. These provide new commands to share alignments and three-dimensional molecular structure data, add the possibility of registering and discovering DAS servers, and provide a convention for supplying different types of data plots. We present examples of web sites and applications that use the new extensions. We operate a public registry of DAS sources, which now includes entries for more than 250 distinct sources. CONCLUSION: Our DAS extensions are essential for the management of the growing number of services and the exchange of diverse biological data sets. In addition, the extensions allow new types of applications to be developed and scientific questions to be addressed. The registry of DAS sources is available at http://www.dasregistry.org. RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license, which is similar to the 'Creative Commons Attribution Licence'. In brief, you may copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.

Crossref

Springer - Publisher Connector

PubMed Central

Apollo (Cambridge)

King's Research Portal

Systems Metagenomics: Applying Systems Biology Thinking to Human Microbiome Analysis

Author: E Boutet
FP Breitwieser
JT Simpson
M Nei
PJ Turnbaugh
RC Edgar
RD Finn
RD Isokpehi
TK Attwood
Y Sanz
Publication venue: Springer Science and Business Media LLC
Publication date: 24/08/2018

Crossref

Royal Holloway - Pure

Predicting active site residue annotations in the Pfam database

Author: A Ben-Shimon
A Gutteridge
AH Elcock
AH Liu
Alex Bateman
AR Panchenko
BM Beadle
CG Nevill-Manning
CH Wu
CT Porter
D La
EL Sonnhammer
H Yao
I Letunic
Jaina Mistry
KC Chou
KM Mayer
M Ota
MJ Zvelebil
N Hulo
ND Rawlings
NJ Mulder
NV Petrova
O Lichtarge
P Aloy
P Puntervoll
PD Dobson
R Greaves
RD Finn
Robert D Finn
S Velankar
SR Eddy
W Tian
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Approximately 5% of Pfam families are enzymatic, but only a small fraction of the sequences within these families (<0.5%) have had the residues responsible for catalysis determined. To increase the active site annotations in the Pfam database, we have developed a strict set of rules, chosen to reduce the rate of false positives, which enable the transfer of experimentally determined active site residue data to other sequences within the same Pfam family. Description: We have created a large database of predicted active site residues. On comparing our active site predictions to those found in UniProtKB, Catalytic Site Atlas, PROSITE and MEROPS, we find that we make many novel predictions. On investigating the small subset of predictions made by these databases that are not predicted by us, we found that these sequences did not meet our strict criteria for prediction. We assessed the sensitivity and specificity of our methodology and estimate that only 3% of our predicted sequences are false positives. Conclusion: We have predicted 606,110 active site residues, of which 94% are not found in UniProtKB, and have increased the active site annotations in Pfam by more than 200-fold. Although implemented for Pfam, the tool we have developed for transferring the data can be applied to any alignment with associated experimental active site data and is available for download. Our active site predictions are re-calculated at each Pfam release to ensure they are comprehensive and up to date. They provide one of the largest available databases of active site annotation.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Integrating modes of policy analysis and strategic management practice : requisite elements and dilemmas

Author: Ackermann F
Andersen DF
Argyris C
Bardach EA
Behn RD
Berger PL
Blenkin GM
Bryson JM
C B Finn
C Eden
Crosby BC
D F Andersen
Eden C
Eisenhardt KM
F Ackermann
Finn C
Forrester J
G P Richardson
Holman P
Huxham C
J M Bryson
Janis IL
Kolb DA
Milling PM
Pettigrew A
Phillips R
Richardson G
Schweiger DM
Senge P
Sterman JD
Warren KD
Weimar D
Wildavsky A
Zagonel AA
Zahn EKO
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2008

There is a need to bring methods to bear on public problems that are inclusive, analytic, and quick. This paper describes the efforts of three pairs of academics, working from three different though complementary theoretical foundations and intervention backgrounds (i.e., ways of working), who set out together to meet this challenge. Each of the three pairs had conducted dozens of interventions that had been regarded as successful or very successful by the client groups in dealing with complex policy and strategic problems. One approach focused on leadership issues and stakeholders, another on negotiating competitive strategic intent with attention to stakeholder responses, and the third on analysis of feedback ramifications in developing policies. This paper describes the 10-year longitudinal research project designed to address the above challenge. The important outcomes are reported: the requisite elements of a general integrated approach, and the enduring puzzles and tensions that arose from seeking to design a wide-ranging multi-method approach.

CiteSeerX

Crossref

University of Strathclyde Institutional Repository

espace@Curtin

DODO: an efficient orthologous genes assignment tool based on domain architectures. Domain based ortholog detection

Author: A Kuzniar
C Vogel
CE Storm
CM Zmasek
EV Kriventseva
EW Sayers
F Delsuc
G Ostlund
M Ashburner
M Bashton
M Levitt
M Pellegrini
M Remm
R Jothi
RD Finn
RL Tatusov
RT van der Heijden
Timothy H Wu
Ting-wen Chen
TJ Hubbard
Wailap V Ng
Wen-chang Lin
WM Fitch
Z Fu
Publication venue: BioMed Central
Publication date: 01/10/2010

Background: Orthologs are genes derived from the same ancestral gene loci after speciation events. Orthologous proteins usually have similar sequences and perform comparable biological functions. Therefore, ortholog identification is useful in the annotation of newly sequenced genomes. With the rapidly increasing number of sequenced genomes, constructing or updating ortholog relationships between all genomes requires considerable effort and computation time. In addition, elucidating ortholog relationships between distantly related genomes is challenging because of the lower sequence similarity. Therefore, an efficient ortholog detection method that can deal with a large number of distantly related genomes is desired. Results: In this study, an efficient ortholog detection pipeline, DODO (DOmain based Detection of Orthologs), is created on the basis of domain architectures. Supported by domain composition, which is usually directly related to protein function, DODO could facilitate ortholog detection across distantly related genomes. DODO works in two main steps. Starting from domain information, it first assigns proteins to groups according to their domain architectures and then identifies orthologs within those groups with much reduced complexity. Here DODO is shown to detect orthologs between two genomes in a considerably shorter time than the traditional reciprocal best hits method, and the advantage is more significant when a large number of genomes is analyzed. The output of DODO is highly comparable with other known ortholog databases. Conclusions: DODO provides a new, efficient pipeline for the detection of orthologs in a large number of genomes. In addition, a database established with DODO is easier to maintain and could be updated with relatively little effort. The DODO pipeline can be downloaded from http://140.109.42.19:16080/dodo_web/home.htm
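
The two-step idea in this abstract (group proteins by domain architecture, then look for orthologs only within each group) can be illustrated with a minimal Python sketch. It is not the DODO pipeline: the function names, the input tuple layout, and the domain labels `domA`/`domB` are hypothetical, and the second step here only enumerates cross-genome candidate pairs, because the abstract does not say how orthologs are chosen within a group.

```python
from collections import defaultdict
from itertools import combinations


def group_by_architecture(proteins):
    """Step 1: bucket proteins by their ordered domain architecture.

    `proteins` is an iterable of (protein_id, genome, domains) tuples,
    where `domains` is the ordered list of domain labels (assumed input format).
    """
    groups = defaultdict(list)
    for protein_id, genome, domains in proteins:
        groups[tuple(domains)].append((protein_id, genome))
    return dict(groups)


def candidate_pairs(groups):
    """Step 2 (simplified): list cross-genome pairs within each group.

    A real method would apply a further criterion here; this sketch only
    shows how grouping shrinks the set of pairs that must be compared.
    """
    pairs = []
    for architecture, members in groups.items():
        for (id_a, genome_a), (id_b, genome_b) in combinations(members, 2):
            if genome_a != genome_b:
                pairs.append((architecture, id_a, id_b))
    return pairs


# Hypothetical toy data: five proteins from two genomes.
proteins = [
    ("hsa_P1", "human", ["domA", "domB"]),
    ("mmu_P1", "mouse", ["domA", "domB"]),
    ("hsa_P2", "human", ["domA"]),
    ("mmu_P2", "mouse", ["domA"]),
    ("hsa_P3", "human", ["domA"]),
]
groups = group_by_architecture(proteins)
pairs = candidate_pairs(groups)
```

With this toy input, only pairs that share an architecture are compared, instead of every human protein against every mouse protein.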

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

CODA: Accurate Detection of Functional Associations between Proteins in Eukaryotic Genomes Using Domain Fusion

Author: Adam J. Reid
AJ Enright
Andrew B. Clegg
B Snel
C von Mering
C Yeats
Christine A. Orengo
CJ Marcotte
DE Barnes
EM Marcotte
F Bellivier
G Apic
I Yanai
Juan A. G. Ranea
K Truong
M Huynen
Magnus Rattray
P Resnik
PM Bowers
PW Lord
RD Finn
S Hoffman
SF Altschul
SK Kummerfeld
TF Smith
Publication venue: Public Library of Science
Publication date: 01/01/2009

Background: In order to understand how biological systems function it is necessary to determine the interactions and associations between proteins. Gene fusion prediction is one approach to detection of such functional relationships. Its use is however known to be problematic in higher eukaryotic genomes due to the presence of large homologous domain families. Here we introduce CODA (Co-Occurrence of Domains Analysis), a method to predict functional associations based on the gene fusion idiom. Methodology/Principal Findings: We apply a novel scoring scheme which takes account of the genome-specific size of homologous domain families involved in fusion to improve accuracy in predicting functional associations. We show that CODA is able to accurately predict functional similarities in human with comparison to state-of-the-art methods and show that different methods can be complementary. CODA is used to produce evidence that a currently uncharacterised human protein may be involved in pathways related to depression and that another is involved in DNA replication. Conclusions/Significance: The relative performance of different gene fusion methodologies has not previously been explored. We find that they are largely complementary, with different methods being more or less appropriate in different genomes. Our method is the only one currently available for download and can be run on an arbitrary dataset by the user. The CODA software and datasets are freely available from ftp://ftp.biochem.ucl.ac.uk/pub/gene3d_data/v6.1.0/CODA/. Predictions are also available via web services from http://funcnet.eu/

CiteSeerX

Public Library of Science (PLOS)

Crossref

PubMed Central

UCL Discovery

2D-Qsar for 450 types of amino acid induction peptides with a novel substructure pair descriptor having wider scope

Author: AM Helguera
AN Jain
CE Shannon
DJ Rogers
F Tian
H Akaike
H Nielsen
HE Ahmed
K Udaka
N Majeux
P Finn
P Willett
R Kohavia
R Nilakantan
RD King
RE Carhart
RP Sheridan
Satoru Miyano
Tsutomu Osoda
Z Zhao
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Quantitative structure-activity relationship (QSAR) analysis of peptides is helpful for designing various types of drugs, such as kinase inhibitors or antigens. Capturing various properties of peptides is essential for two-dimensional QSAR analysis. A peptide descriptor is an important element for capturing these properties. The atom pair holographic (APH) code is designed for the description of peptides; it represents peptides as combinations of thirty-six types of key atoms and the intermediate binding between two key atoms. Results: The substructure pair descriptor (SPAD) represents peptides as combinations of forty-nine types of key substructures and the sequence of amino acid residues between two substructures. The key substructures are larger and the sequences are longer than in traditional descriptors. Similarity searches on a C5a inhibitor data set and a kinase inhibitor data set showed that the order of inhibitors became three times higher in each case when peptides were represented with SPAD. A comparison of the scope of each descriptor shows that SPAD captures different properties from APH. Conclusion: QSAR/QSPR for peptides is helpful for designing various types of drugs, such as kinase inhibitors and antigens. SPAD is a novel and powerful descriptor for various types of peptides. The accuracy of QSAR/QSPR becomes higher when peptides are described with SPAD.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A Network of SCOP Hidden Markov Models and Its Analysis

Author: A Andreeva
AG Murzin
E Kolaczyk
J Gough
J Soding
L Freeman
L Lo Conte
Layne T Watson
Lenwood S Heath
Liqing Zhang
ME Newman
RD Finn
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The Structural Classification of Proteins (SCOP) database uses a large number of hidden Markov models (HMMs) to represent families and superfamilies composed of proteins that presumably share the same evolutionary origin. However, how the HMMs are related to one another has not been examined before. Results: In this work, taking into account the processes used to build the HMMs, we propose a working hypothesis to examine the relationships between HMMs and the families and superfamilies that they represent. Specifically, we perform an all-against-all HMM comparison using the HHsearch program (similar to BLAST) and construct a network where the nodes are HMMs and the edges connect similar HMMs. We hypothesize that the HMMs in a connected component belong to the same family or superfamily more often than expected under a random network connection model. Results show a pattern consistent with this working hypothesis. Moreover, the HMM network possesses features distinctly different from the previously documented biological networks, exemplified by the exceptionally high clustering coefficient and the large number of connected components. Conclusions: The current finding may provide guidance in devising computational methods to reduce the degree of overlaps between the HMMs representing the same superfamilies, which may in turn enable more efficient large-scale sequence searches against the database of HMMs.
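
The network construction described in this abstract (nodes are HMMs, edges join similar HMMs, analysis by connected component) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hit tuples, the `score` scale in [0, 1], and the `threshold` value are hypothetical and do not reproduce HHsearch output or the authors' cut-off.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as sets."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal from each unvisited node.
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Hypothetical all-against-all hits: (hmm_a, hmm_b, similarity score).
hits = [
    ("hmm1", "hmm2", 0.98),
    ("hmm2", "hmm3", 0.95),
    ("hmm4", "hmm5", 0.40),
]
threshold = 0.9  # assumed cut-off for calling two HMMs "similar"

nodes = ["hmm1", "hmm2", "hmm3", "hmm4", "hmm5"]
edges = [(a, b) for a, b, score in hits if score >= threshold]
components = connected_components(nodes, edges)
```

Each resulting component could then be checked against the family or superfamily labels of its HMMs, which is the comparison the working hypothesis calls for.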

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Domain-Domain Interactions Underlying Herpesvirus-Human Protein-Protein Interaction Networks

Author: A Chatr-aryamontri
A Schlicker
A Stein
B Aranda
B Schuster-Bockler
BD Greenbaum
C Uniprot
D Maglott
E Fossum
EJG Pitman
EV Koonin
EW Verschuren
I Bahir
LM Iyer
MA Calderwood
MD Dyer
RD Finn
Robert Belshaw
S Costa
S Redpath
SF Altschul
SI Yoon
T Driscoll
T Pawson
TM Nye
V Navratil
Y Nakamura
Z Itzhaki
Zohar Itzhaki
Publication venue: Public Library of Science
Publication date: 01/01/2011

Protein domains play an important role in mediating protein-protein interactions. Furthermore, the same domain-pairs mediate different interactions in different contexts and in various organisms, and therefore domain-pairs are considered the building blocks of interactome networks. Here we extend these principles to the host-virus interface and find the domain-pairs that potentially mediate human-herpesvirus interactions. Notably, we find that the same domain-pairs used by other organisms for mediating their interactions underlie statistically significant fractions of human-virus protein inter-interaction networks. Our analysis shows that viral domains tend to interact with human domains that are hubs in the human domain-domain interaction network. This may enable the virus to easily interfere with a variety of mechanisms and processes involving different human proteins that carry the relevant hub domain. Comparative genomics analysis provides hints at a molecular mechanism by which the virus acquired some of its interacting domains from its human host.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Expansion of the Protein Repertoire in Newly Explored Environments: Human Gut Microbiome Specific Protein Families

Author: Adam Godzik
AJ Enright
AM Cerdeno-Tarraga
BH Dessailly
C Zmasek
David T. Jones
DC Savage
EC Martens
H Noguchi
J Xu
JA Shipman
John C. Wooley
K Kurokawa
Kyle Ellrott
Lukasz Jaroszewski
N Siew
NC Verberkmoes
PB Eckburg
PJ Turnbaugh
RD Finn
RL Tatusov
S Hunter
S Yooseph
SF Altschul
SR Eddy
SR Gill
TZ DeSantis
W Li
Weizhong Li
Publication venue: Public Library of Science
Publication date: 01/06/2010

The microbes that inhabit particular environments must be able to perform molecular functions that provide them with a competitive advantage to thrive in those environments. As most molecular functions are performed by proteins and are conserved between related proteins, we can expect that organisms successful in a given environmental niche would contain protein families that are specific for functions that are important in that environment. For instance, the human gut is rich in polysaccharides from the diet or secreted by the host, and is dominated by Bacteroides, whose genomes contain a highly expanded repertoire of protein families involved in carbohydrate metabolism. To identify other protein families that are specific to this environment, we investigated the distribution of protein families in the currently available human gut genomic and metagenomic data. Using an automated procedure, we identified a group of protein families strongly overrepresented in the human gut. These include not only many families described previously but also, interestingly, a large group of previously unrecognized protein families, which suggests that we still have much to discover about this environment. The identification and analysis of these families could provide us with new information about an environment critical to our health and well-being.

Public Library of Science (PLOS)

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California